Cloud Computing vs. Physical Infrastructure
This article is a translation of a post by Frédéric Faure (system architect at Ysance) on the differences between using a cloud computing infrastructure such as AWS and building your own physical one. We are republishing it because we believe the article is spot-on.
I have noticed many questions about the differences inherent in the choice between a Cloud infrastructure such as AWS (Amazon Web Services) and a traditional physical infrastructure. Firstly, there are a number of preconceived notions on this topic that I will try to unpack for you. Then, it must be understood that each type of infrastructure has its advantages and disadvantages: a typical cloud infrastructure will not necessarily meet all of your requirements, but it should be able to meet some of them by optimizing or simplifying functionality that a traditional physical infrastructure also offers. I will then describe the differences I have noticed between the two approaches, to help you make up your own mind.
The Framework
Cloud
There are several ways of using the Cloud, and I will focus on AWS, which offers infrastructure-oriented services, rather than services of the Google type (GAE, Google App Engine, to name one), which provide an execution environment for web applications developed against the APIs they supply (similar to a framework). With the latter, customers (those holding the credit card) cannot really speak of managing the infrastructure: you upload your application using the APIs provided and leave the entire management of the infrastructure to the service provider. That does not make it a lesser cloud computing service, just another type of Cloud service, oriented towards PaaS rather than infrastructure.
Physical Infrastructure
As far as physical infrastructures are concerned, I will examine both self-hosted infrastructures and infrastructures maintained by a hosting provider. Similarly, I will look at infrastructures based directly on hardware as well as those based on virtualized environments. Cloud computing is also based on virtualization, but what interests us here is not that technology itself, rather the way it is delivered to customers (you). In fact, you can launch instances via a console, as you do for EC2, if you have an ESX (VMware) at hand, for example, but that is “just” a hypervisor partitioning a physical server into multiple virtual machines: you still have to deal with purchasing the equipment (blades, etc.), configuring the network, and so on. We will come back to those details later.
Cloud computing = Systems administrators marked down?
Yes, the sales are on! Are you looking for a sweater, a jacket… a systems administrator? I often meet people who think that the Cloud (in the case of AWS) will allow them to do without an experienced systems administrator and to build an infrastructure with less expertise. The answer is obvious: WRONG!
Maybe a clever sales pitch can convince you that the various services are so user-friendly that you can do it all yourself, and that the pre-packaged AMIs (Amazon Machine Images) will make life easy, but it goes without saying that once you launch your EC2 instances, you are connecting to computers (SSH on port 22 for Linux, TSE/RDP on port 3389 for Windows), and you will still have to set parameters, do the tuning, and so on.
What is true for system administrators facing AWS is also true for system architects facing cloud computing services that expose the higher layers (PaaS such as Google App Engine). A person with experience in the field, able to translate requirements into a working infrastructure, is necessary: the tool may change, but the skills must still be available. Note, however, that if you use GAE, you no longer need a systems administrator for the application itself. When the Cloud provider offers a service at a given tier (HaaS, IaaS, PaaS, etc.), there is no longer a need for staff to manage the layers below it. In exchange, you accept the framework imposed by the Cloud provider.
System administrators cannot be eliminated, but their role is changing: they are becoming more and more like developers. Being able to pull in resources on the fly allows infrastructure management to be programmed and automated through scripts that call the APIs provided by Amazon to communicate with its web services. Everything at Amazon is a web service: EC2, EBS, SQS, S3, SimpleDB. The only non-SOAP/REST interaction is when you connect directly to the EC2 instances you have invoked through the web service, or when those EC2 instances talk to the EBS volumes you have invoked via the… I’ll leave you guessing.
Rather than going to the server room to add a disk or plug in a server (in the case of a physical architecture), or picking up the phone and asking the hosting provider to do it (have a coffee… call again… take a Xanax or a Prozac…), the administrator can request resources through a script in Ruby or Python. You can therefore automate the Cloud infrastructure, and much more besides, with a set of scripts and tools.
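To make this concrete, here is a minimal sketch of what “requesting resources through a script” can look like. It uses Python with the boto3 library (a current AWS SDK, newer than the tooling available when this article was written); the AMI ID and key-pair name are placeholders you would replace with your own:

```python
# Minimal sketch: provisioning an EC2 instance from code instead of a
# server room. Assumes AWS credentials are already configured locally;
# the AMI ID and key-pair name below are placeholders.
import boto3

ec2 = boto3.resource("ec2", region_name="us-east-1")

instances = ec2.create_instances(
    ImageId="ami-0123456789abcdef0",  # placeholder AMI
    InstanceType="t2.micro",
    KeyName="my-key-pair",            # placeholder key pair
    MinCount=1,
    MaxCount=1,
)

instance = instances[0]
instance.wait_until_running()  # block until the instance is up
instance.reload()              # refresh attributes (public DNS, etc.)
print(f"Launched {instance.id} at {instance.public_dns_name}")
```

Terminating the instance (`instance.terminate()`) returns it to the pool and billing stops: the whole lifecycle is driven from code.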
The profession of the systems administrator is therefore evolving between a physical infrastructure and a Cloud infrastructure such as AWS: the job looks more and more like a developer’s. But system administrators remain essential nonetheless.
Elastic is great!
As I said before, one of the essential differences between the two types of infrastructure is the flexibility and dynamism provided by the Cloud solution, compared to a traditional physical architecture (whether based on virtualization or not). This means eliminating the time needed for logistics: purchasing equipment, installing the operating system, connecting to the network (physical cabling and interface configuration), and so on. Similarly, when you no longer need a resource (an EC2 virtual instance, an EBS volume, an S3 object, etc.), you return it to the resource pool: it is reinitialized so that none of your data can be recovered, and made available again for the next call to the web service.
You also have full access to certain elements such as the security groups (firewalls) set for each instance, and this is very practical, especially compared to a traditional hosting provider: remember the last time you had to change your firewall rules?
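For instance, changing a firewall rule becomes a single API call rather than a support ticket. A hedged sketch with boto3 (the security group ID is a placeholder):

```python
# Sketch: opening HTTPS (port 443) in a security group through the API,
# instead of filing a ticket with a hosting provider.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

ec2.authorize_security_group_ingress(
    GroupId="sg-0123456789abcdef0",  # placeholder security group ID
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "IpRanges": [{"CidrIp": "0.0.0.0/0", "Description": "allow HTTPS"}],
    }],
)
```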
But it’s not just about the pros and cons of buying a server versus running instances. AWS services are backed by data centers that are already organized and tested at an industrial scale. All the requirements that must be met in terms of fire protection, system cooling, redundant electrical supply, physical security against break-ins, distribution of the hardware across two or more physical datacenters for disaster recovery, etc. involve a colossal initial investment, and even once everything is installed, you still won’t be able to recreate the same quality within your own company (99% of the time, in any case). You can, however, achieve all of this, or part of it, with a traditional hosting provider.
But there are also the software-level services, such as data durability management (redundancy/replication on EBS and S3), accessibility and high availability, hardware monitoring (to receive an alert when physical components show signs of weakening), failure-handling procedures, etc. I’ll let you read the 9 Principles of S3 (French version) to understand just how many concepts are involved. You won’t get all of that with a traditional hosting provider (and forget about hosting at home). The quality of the S3 service is actually a huge advantage, especially at current prices… Let’s talk about prices!
Costs
There are no fixed rules. With the Cloud, you pay for a resource by the hour, and when you no longer use it, you stop paying. Instances can also be reserved for one or three years (reserved instances): you pay a fee upfront and then pay for usage at a discounted rate, which becomes profitable beyond a certain percentage of resource utilization over the year (or the three years). To easily calculate how much your infrastructure will cost you, use the new Calculator provided by Amazon. In all cases, one part of the bill is tied to hourly usage and another to traffic/volume stored.
You can then compare that with the cost of your local infrastructure or your hosting provider. Cloud pricing is particularly attractive in the following cases:
- The pricing is unbeatable for proofs of concept, events/presentations, or load/validation test architectures.
- It is very attractive for applications or APIs built on a SaaS-type economic model, where you spend money on resources only when a customer pays to use those APIs.
- It’s a good fit for social applications, on Facebook for example, which can take off overnight thanks to social-media effects and experience booms (or drops) in traffic.
- It’s also convenient when you launch a new company, or a specific project within a larger entity, and don’t want to invest heavily in logistics from the start.
For the many other specific cases, you will have to make your own calculations, along the lines of the sketch below.
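As one such sketch, the Python snippet below compares on-demand and reserved pricing to find the utilization level at which reserving pays off. The rates are purely illustrative, not Amazon’s actual prices:

```python
# Back-of-the-envelope break-even between on-demand and reserved instances.
# All prices below are hypothetical, for illustration only.
HOURS_PER_YEAR = 24 * 365

ON_DEMAND_RATE = 0.10     # $/hour, hypothetical
RESERVED_UPFRONT = 300.0  # one-time fee in $, hypothetical
RESERVED_RATE = 0.04      # $/hour at the discounted rate, hypothetical

def cost_on_demand(utilization: float) -> float:
    return ON_DEMAND_RATE * HOURS_PER_YEAR * utilization

def cost_reserved(utilization: float) -> float:
    return RESERVED_UPFRONT + RESERVED_RATE * HOURS_PER_YEAR * utilization

# Print which option is cheaper at each utilization level.
for pct in range(0, 101, 10):
    u = pct / 100
    cheaper = "reserved" if cost_reserved(u) < cost_on_demand(u) else "on-demand"
    print(f"{pct:3d}% utilization -> {cheaper}")
```

With these made-up numbers, reserving wins above roughly 57% utilization; the real break-even point depends entirely on Amazon’s current rates and your actual usage profile.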
Slacking off? Well, no…
Usually, whatever type of infrastructure you have, the same components and mechanisms should be put in place. However, it must be recognized that the comfort of a “home”-hosted infrastructure often leads to a lack of rigor on many issues. Because Amazon’s EC2 instances are dynamic and volatile by nature, AWS forces you to install mechanisms (which should be standard anyway) to take failure handling and disaster recovery seriously, and to identify important data in order to ensure its durability (EBS snapshots, S3 backups, etc.).
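As an example of such a mechanism, here is a minimal, hedged sketch of scripted EBS snapshots with boto3: the volume ID is a placeholder, and in practice you would run this from a scheduler (cron, for instance):

```python
# Sketch: snapshotting an EBS volume, one of the "mechanisms which should
# be standard" that EC2's volatility forces you to put in place.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # placeholder volume ID
    Description="nightly backup",
)
print(f"Started snapshot {response['SnapshotId']}")
```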
Shared networks and resources
The network is an important element, and in more ways than one! The Cloud’s network configuration is already prepared for you, which is convenient. But it also means that you don’t control it, and consequently you cannot diagnose, for example, the causes of a slowdown. There is a similar lack of transparency regarding shared resources in a Cloud: at our level it is impossible to estimate the impact of the other tenants with whom we share a resource (the physical machine hosting our EC2 instance, the physical device behind our network-attached EBS volume, the network bandwidth, etc.). The only monitoring possible is limited to what enters and leaves the instance itself (e.g., an EBS volume is a network-attached device, yet there is no way to check the network connectivity conditions, only the disk I/O). The monitoring you can do is focused on the EC2 virtual machine (the instance is managed by a hypervisor based on Xen virtualization). Total visibility of our infrastructure is therefore not possible on a Cloud. This must be taken into account, and accepted, if you want to implement your architecture on the Cloud. You also need to accept the same lack of transparency for the other shared resources mentioned above (EBS, etc.).
Similarly, multicast is not possible at the communication-protocol level: keep this in mind for certain clustering setups. This constraint is understandable, given the far-reaching impact that poorly managed multicast could have.
This is inherent in the way the Cloud operates: it offers simplicity by masking a certain number of elements that we no longer control.
On-call support, monitoring, BCP (Business Continuity Planning) and penalties
A question I have often been asked is: “Does Amazon provide an on-call support service (for your applications/infrastructure running on AWS)?” The answer is no. AWS has to be seen as a set of tools offered by Amazon, which makes sure that these tools are always up and working well. Amazon supports the tools themselves and the development of their features. However, you are responsible for how you use them (after all, Amazon does not hold the private keys of your EC2 instances…), so there is no monitoring/on-call support/BCP (Business Continuity Planning) package.
Unlike the specific contracts you can sign with a hosting provider, you have to provide these services yourself or, in the case of on-call support for example, outsource them to a service management company. Ditto for monitoring: Amazon offers Amazon CloudWatch, but the information it exposes (CPU %, bytes read/written to disk, and network bytes in/out) is too coarse for proper monitoring of the kind you would expect from Centreon/Nagios, Cacti, Zabbix or Munin. CloudWatch feeds Auto Scaling, but it does not replace real monitoring. Some traditional hosting providers offer monitoring packages as part of their services.
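To illustrate just how coarse those metrics are, here is a hedged sketch that pulls the last hour of CPU data for one instance via boto3 (the instance ID is a placeholder); this is essentially all CloudWatch gives you out of the box:

```python
# Sketch: fetching the basic CPU metric CloudWatch exposes for an EC2
# instance. Compare this with the depth of a Nagios/Zabbix deployment.
from datetime import datetime, timedelta

import boto3

cw = boto3.client("cloudwatch", region_name="us-east-1")

stats = cw.get_metric_statistics(
    Namespace="AWS/EC2",
    MetricName="CPUUtilization",
    Dimensions=[{"Name": "InstanceId", "Value": "i-0123456789abcdef0"}],  # placeholder
    StartTime=datetime.utcnow() - timedelta(hours=1),
    EndTime=datetime.utcnow(),
    Period=300,              # 5-minute granularity
    Statistics=["Average"],
)

for point in sorted(stats["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], f"{point['Average']:.1f}%")
```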
As for BCP and the related penalties, the situation is the same as hosting internally: you are responsible for the resources and for how you manage failures/disaster recovery, within the capabilities of the tools (AWS). This is where understanding the overall architecture of Amazon’s services matters: if you don’t understand how the tool works, you won’t be able to implement an effective BCP. As for penalties, there is nothing unusual: they simply amount to a small credit on the bill for any month in which availability falls below the “Service Unavailable” threshold defined by Amazon’s criteria. This has nothing to do with penalties based on the sums you lose due to unavailability.
It is absolutely necessary to consider Amazon Web Services as a tool. While there is paid AWS support (which you can call for questions and tool-level issues), you will never get the full contractual coverage that you would have with a more traditional hosting provider, and, more generally, you are responsible for your architecture at every level (including the security of your instances: beware of losing your keys!).
Security
Security… often a taboo subject as soon as you start talking about Cloud Computing. I don’t mean the integrity of the stored data, or even the management of access to the virtual instances, for which we are responsible; I am talking about the confidentiality of the data stored on the different services (S3, EBS, EC2, SQS, etc.) or in transit between those services.
The first key point is that the level of security in Amazon’s datacenters, not only physically but, equally important, in programmatic terms, will always be far ahead of the average company’s server room, or even the datacenter of the smallest hosting providers. Firstly, because it is Amazon’s business: a security problem revealed in their infrastructure would have immediate consequences in terms of user reaction (and therefore in terms of business). This is an essential point: Amazon has to prove itself in this sensitive area and is therefore obliged to do its best to win customers over. In addition, the sheer scale of its facilities allows Amazon to pool its investments in security and make them pay off: this is not conceivable for smaller companies, or for companies that do not specialize in this area. Amazon therefore has both the means and the obligation to ensure security.
What fuels skepticism is that the Cloud cannot easily be audited: you have to trust it. Yet this is no riskier than placing your trust in a traditional hosting provider, or in your in-house team…
But it’s brand new! So, beware! A normal reaction. Perhaps, though, this is exactly the opportunity to work on security at our own level, something often neglected through overconfidence or lack of interest. The first task is to encrypt information: stored data as well as data in transit. Remember to take the encryption/decryption CPU load into account. The second is to fully understand the security mechanisms of the various Amazon services:
- Access Credentials: Access Keys, X.509 Certificates & Key Pairs
- Sign-In Credentials: E-mail Address, Password & AWS Multi-Factor Authentication Device
- Account Identifiers: AWS Account ID & Canonical User ID
Then you will have to select the personnel who will be authorized to access the different security keys.
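As a sketch of that first task, encrypting data before it leaves your hands, here is a hedged example of client-side encryption before an S3 upload. It uses the third-party Python `cryptography` package; the bucket and object names are placeholders:

```python
# Sketch: client-side encryption before storing data on S3, so that
# confidentiality does not rest on trust alone. Bucket/key names are
# placeholders; keep the encryption key somewhere safe (e.g. off-cloud).
import boto3
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # store this key safely, outside the Cloud
fernet = Fernet(key)

ciphertext = fernet.encrypt(b"sensitive payload")

s3 = boto3.client("s3")
s3.put_object(Bucket="my-bucket", Key="data.enc", Body=ciphertext)

# Later: fetch the object and decrypt it with the same key.
obj = s3.get_object(Bucket="my-bucket", Key="data.enc")
plaintext = fernet.decrypt(obj["Body"].read())
```

Remember the CPU cost mentioned above: encryption and decryption are not free.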
Conclusions
The evolution of roles in infrastructure management comes out clearly in this first part: from managing physical resources through APIs, to the mechanisms that guarantee data durability and service availability, up to the provisioning of server capacity and the physical security of the datacenter, all supported transparently for you.
The end result: we “only” have to use APIs that talk to a remote service; that is the difference from a physical infrastructure. Virtualization, an underlying aspect of AWS that has been known and used for some time, is not in itself the revolution; although I do not deny the complexity of its implementation and operation, the real added value is the service offered on top of it, combined with a new “pay-as-you-use”, *aaS (as a service) economic model. This has allowed the emergence of applications (such as games on social networks) that only a few years ago would have been killed off by the initial investment.
The services provided by Cloud Computing inevitably bring with them a certain loss of control and visibility over some parts of the infrastructure, in particular the network. This is the price you have to pay; whether it is completely negligible or genuinely problematic depends entirely on your needs.
AWS should be seen as a complete tool, but it does not free us from following best practices or deploying all the standard components of an infrastructure: log servers, monitoring, BCP, configuration management, etc. All these elements are, and will remain, your responsibility. Don’t have naïve expectations: since AWS offers HaaS and IaaS, you still need a competent systems administrator, and in particular one who fully understands the architecture of AWS (otherwise you may be disappointed); if you switch to GAE (Google App Engine), you will still need an architect who fully understands the architecture of GAE, and so on. The profession is constantly evolving.
As far as AWS security is concerned, I am reasonably confident. First of all, it should be emphasized that information and data are probably less secure within your own company than entrusted to Amazon (in most cases, anyway; I won’t generalize). AWS’ exposure to the Internet and Amazon’s business stake in the matter mean that Amazon takes security very seriously. Also, you remain responsible for a large part of that security (key management, etc.) and, believe me, that is by far the riskiest part. When it comes to transferring and storing data, think “encryption.”